Sparse principal component analysis for multiblock data and its extension to sparse multiple correspondence analysis
Abstract
Two new methods for selecting groups of variables have been developed for multiblock data: "Group Sparse Principal Component Analysis" (GSPCA) for continuous variables and "Sparse Multiple Correspondence Analysis" (SMCA) for categorical variables. GSPCA is a compromise between the Sparse PCA method of Zou, Hastie and Tibshirani and the "group Lasso" method of Yuan and Lin. PCA is formulated as a regression-type optimization problem, and the group Lasso constraints are imposed on the regression coefficients to produce modified principal components with sparse loadings. This reduces the number of nonzero coefficients and hence the number of selected groups. SMCA is a straightforward extension of GSPCA to groups of indicator variables, using the chi-square metric. Two real examples are used, one to illustrate each method. The first is a data set of 25 trace elements measured in three tissues of 48 crabs (25 blocks of 3 variables). The second is a data set on 502 women, aimed at identifying genes that affect skin aging, with more than 370,000 blocks, each block corresponding to a SNP (Single Nucleotide Polymorphism) coded into 3 categories.
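The group-selection effect described in the abstract can be illustrated with the group soft-thresholding operator, which is the standard proximal operator of the group Lasso penalty. The sketch below is not the authors' GSPCA algorithm. The function name `group_soft_threshold`, the loading vector, the block layout and the value of `lam` are all hypothetical and chosen only for illustration. It shows how a whole block of loadings is set to zero when its Euclidean norm falls below the penalty level.

```python
import numpy as np

def group_soft_threshold(v, groups, lam):
    """Proximal operator of the group Lasso penalty lam * sum_g ||v_g||_2.

    Each block is shrunk toward zero as a unit; a block whose Euclidean
    norm is at most lam is set to zero entirely (the group is dropped).
    """
    out = np.zeros_like(v, dtype=float)
    for idx in groups:
        norm = np.linalg.norm(v[idx])
        if norm > lam:
            out[idx] = (1.0 - lam / norm) * v[idx]
    return out

# Hypothetical loading vector: 3 blocks of 3 variables each
# (illustrative values, not taken from the paper's data sets).
v = np.array([0.9, 0.8, 0.7, 0.05, -0.02, 0.01, 0.3, -0.4, 0.2])
groups = [np.arange(0, 3), np.arange(3, 6), np.arange(6, 9)]

shrunk = group_soft_threshold(v, groups, lam=0.1)

# Indices of the blocks that keep at least one nonzero loading.
selected = [g for g, idx in enumerate(groups) if np.any(shrunk[idx] != 0)]
print(selected)
```

With these illustrative values the middle block has norm below `lam`, so all three of its loadings become zero together, while the other two blocks are only shrunk. Raising `lam` would drop more blocks, which is how a penalty of this kind controls the number of selected groups.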
Similar resources
Sparse Structured Principal Component Analysis and Model Learning for Classification and Quality Detection of Rice Grains
In scientific and commercial fields associated with modern agriculture, the categorization of different rice types and the determination of their quality are very important. Various image processing algorithms have been applied in recent years to detect different agricultural products. The problem of rice classification and quality detection in this paper is presented based on model learning concepts includ...
A New IRIS Segmentation Method Based on Sparse Representation
Iris recognition is one of the most reliable methods for identification. In general, it consists of image acquisition, iris segmentation, feature extraction and matching. Among them, iris segmentation plays an important role in the performance of any iris recognition system. Nonlinear eye movement, occlusion, and specular reflection are the main challenges for any iris segmentation method. In thi...
Speech enhancement based on hidden Markov model using sparse code shrinkage
This paper presents a new hidden Markov model-based (HMM-based) speech enhancement framework based on independent component analysis (ICA). We propose analytical procedures for training clean speech and noise models by the Baum re-estimation algorithm and present a maximum a posteriori (MAP) estimator based on a Laplace-Gaussian combination (for clean speech and noise, respectively) in the HMM ...
Functional Modeling and Classification of Longitudinal Data
We review and extend some statistical tools that have proved useful for analyzing functional data. Functional data analysis primarily is designed for the analysis of random trajectories and infinite-dimensional data, and there exists a need for the development of adequate statistical estimation and inference techniques. While this field is in flux, some methods have proven useful. These include...
Publication date: 2012